Abstract

Face recognition (FR) is an important task in pattern recognition and
computer vision. Sparse representation (SR) has been demonstrated to be a
powerful framework for FR. In general, an SR algorithm treats each face in
a training dataset as a basis function, and tries to find a sparse
representation of a test face under these basis functions. The sparse
representation coefficients then provide a recognition hint. Early SR
algorithms are based on a basic sparse model. Recently, it has been found
that algorithms based on a block sparse model can achieve better
recognition rates. Based on this model, in this paper we use block sparse
Bayesian learning (BSBL) to find a sparse representation of a test face
for recognition. BSBL is a recently proposed framework, which has many
advantages over existing block-sparse-model based algorithms. Experimental
results on the Extended Yale B and the AR face databases show that using
BSBL can achieve better recognition rates and higher robustness than
state-of-the-art algorithms in most cases.

1. Introduction



Owing to the rapid development of network and computer technologies,
face recognition plays an important role in many applications, such as
video surveillance and man-machine interfaces. Many methods have been
developed over the past two decades. Fundamentally, face recognition is a
typical classification problem.

In a typical face recognition system, besides face detection and face
alignment, there are two main stages in the recognition process. One is
feature extraction, which projects a face image to a low dimensional
subspace. Because face images are very large, it is desirable to extract
from each image features that have lower dimension and facilitate
recognition. Many feature extraction methods have been proposed, such as
PCA [3], LPP [5] and LDA [6]. The other stage is classification, which
builds a classification model and assigns a label to a test face image.
Typical classification algorithms include Nearest Neighbor (NN) [7],
Nearest Feature Subspace (NFS) [8] and Support Vector Machine (SVM) [9].

Recently, Wright et al. proposed a novel face recognition method called
Sparse Representation Classification (SRC) [10]. In this method, face
images in the training set form a dictionary matrix (each face image is
vectorized and forms a column of the dictionary matrix), and then a
vectorized test face image is represented under this dictionary matrix.
The representation coefficients provide hints for recognition. For
example, if a test face image and a training face image belong to the same
subject, then the representation coefficients of the vectorized test face
image under the dictionary matrix are sparse (or compressible), i.e., most
coefficients are zero (or close to zero). For each class (i.e., the
columns in the dictionary matrix which are associated with a subject), one
can calculate the reconstruction error of the vectorized test face image
using these columns and the associated representation coefficients. The
test face image is assigned to the class with the minimum reconstruction
error. More frequently, one uses a feature vector extracted from a face
image, instead of the original vectorized face image, in this method. SRC
is robust and can also achieve good performance in the presence of
occlusion and noise.

Following the idea of SRC, a number of SRC-related recognition methods
have been proposed. Gao et al. extended the basic SRC method to a kernel
version [11]. Yang et al. proposed a face recognition method via sparse
coding which is much more robust than SRC to occlusion, corruption and
disguise [12]. Other works improved the basic SRC method using weighted
sparse representations [13], Gabor feature based sparse representations
[14], dimensionality reduction [15], locally adaptive sparse
representations [16] and supervised sparse representations [17].

Recently, it has been found that algorithms based on a block sparse
model [18], instead of the basic sparse representation model, can achieve
higher recognition rates in face recognition [19]. However, these
algorithms ignore intra-block correlation in the representation
coefficients. Intra-block correlation exists because the training face
images of the same class as a test face image are all correlated with the
test face image, and thus the representation coefficients associated with
these training face images are not independent. In sparse reconstruction
scenarios it has been shown [20] that exploiting the intra-block
correlation can significantly improve algorithmic performance.

In this study we use block sparse Bayesian learning (BSBL) [20] to
estimate the representation coefficients. BSBL has many advantages over
existing block-sparse-model based algorithms; in particular, it can
exploit the intra-block correlation in the representation coefficients
for better algorithmic performance. Experimental results on the Extended
Yale B and the AR databases show that BSBL achieves better results than
state-of-the-art SRC algorithms in most cases.

The remainder of this paper is organized as follows. We provide a brief
review of the original face recognition via sparse representation in
Section 2, and of sparse Bayesian learning in Section 3. The block sparse
Bayesian learning approach for face recognition is proposed in Section 4.
Experimental results are reported in Section 5. Conclusions are drawn in
the last section.

2. Related work

2.1. Face recognition via sparse representation 

We first describe the basic SRC method for face recognition. Given
training faces of all K subjects, a dictionary matrix is formed as follows

$$\Phi = [\Phi_1, \Phi_2, \ldots, \Phi_K] \qquad (1)$$

where $\Phi_i = [v_{i,1}, v_{i,2}, \ldots, v_{i,n_i}] \in \mathbb{R}^{m \times n_i}$,
and $v_{i,j}$ is the j-th face^1 of the i-th subject. Then, a vectorized
test face $y \in \mathbb{R}^{m \times 1}$ is represented under the
dictionary matrix as follows

$$
\begin{aligned}
y = \Phi x
  &= v_{1,1} x_{1,1} + \cdots + v_{1,n_1} x_{1,n_1} + \cdots + v_{i-1,n_{i-1}} x_{i-1,n_{i-1}} \\
  &\quad + v_{i,1} x_{i,1} + v_{i,2} x_{i,2} + \cdots + v_{i,n_i} x_{i,n_i} \\
  &\quad + v_{i+1,1} x_{i+1,1} + \cdots + v_{K,n_K} x_{K,n_K}
\end{aligned}
\qquad (2)
$$

where $x = [x_{1,1}, \ldots, x_{i,1}, \ldots, x_{i,n_i}, \ldots, x_{K,n_K}]^T$
is the representation coefficient vector. In the basic SRC method, it is
suggested that if the new test face y belongs to a subject in the training
set, say the i-th subject, then under a sparsity constraint on x, only some
of the coefficients $x_{i,1}, x_{i,2}, \ldots, x_{i,n_i}$ are significantly
nonzero, while the other coefficients, i.e., $x_{j,k}$
($\forall j \neq i, \forall k$), are zero or close to zero.

^1 For simplicity, we describe $v_{i,j}$ as a vectorized face image. But in
practice, $v_{i,j}$ is a feature vector extracted from the face image, as
done in our experiments.

Mathematically, the above idea can be described as the following sparse
representation problem

$$x_0 = \arg\min_x \|x\|_0 \quad \text{s.t.} \quad y = \Phi x \qquad (3)$$

where $\|x\|_0$ counts the number of nonzero elements in the vector x. Once
we have obtained the solution $x_0$, the class label of y can be found by

$$i = \arg\min_j \|y - \Phi \delta_j(x_0)\|_2, \qquad (4)$$

where $\delta_j(x)$ is the characteristic function which returns a vector
of the same size as x, keeping the elements of x associated with the j-th
class and setting all other elements of x to zero.
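
As an illustration of the decision rule (4), the following minimal Python sketch computes the class-wise residuals for a given coefficient vector. The function name `classify_by_residual`, the `labels` array (one class label per dictionary column) and the toy data are assumptions made for this example only; how the coefficients are estimated is left open here.

```python
import numpy as np

def classify_by_residual(Phi, y, x, labels):
    """Assign y to the class with the smallest residual, as in rule (4).

    Phi    : (m, n) dictionary, one training face (or feature) per column
    y      : (m,) test vector
    x      : (n,) representation coefficients, e.g. an estimate of x_0
    labels : (n,) class label of each dictionary column (illustrative input)
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    residuals = np.empty(len(classes))
    for k, c in enumerate(classes):
        # delta_c(x): keep the entries of class c, zero out the rest
        x_c = np.where(labels == c, x, 0.0)
        residuals[k] = np.linalg.norm(y - Phi @ x_c)
    return classes[np.argmin(residuals)], residuals

# Toy example (synthetic data): 3 classes with 2 dictionary columns each.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((20, 6))
labels = [0, 0, 1, 1, 2, 2]
x = np.array([0.0, 0.0, 0.7, -0.4, 0.0, 0.0])  # only class 1 is active
y = Phi @ x
label, residuals = classify_by_residual(Phi, y, x, labels)
```

In this toy case the residual of the active class is essentially zero, so the test vector is assigned to it.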

However, finding the solution to (3) is NP-hard [21]. Recent theories in
compressed sensing [22, 23] show that if the true solution is sparse
enough, under some mild conditions the solution can be found by solving
the following convex-relaxation problem

$$x_1 = \arg\min_x \|x\|_1 \quad \text{s.t.} \quad y = \Phi x. \qquad (5)$$

Further, to deal with small dense model noise, the problem (5) can be
changed to the following one

$$x_1 = \arg\min_x \|x\|_1 \quad \text{s.t.} \quad \|y - \Phi x\|_2 \le \epsilon \qquad (6)$$

where $\epsilon$ is a noise-tolerance constant. Many $\ell_1$-minimization
algorithms can be used to find the solution to (5) or to (6), such as
LASSO [24] and Basis Pursuit Denoising [25].
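
Problems (5) and (6) are normally handled by dedicated solvers. Purely as an illustration, the sketch below applies iterative soft-thresholding (ISTA) to the Lagrangian form $\frac{1}{2}\|y - \Phi x\|_2^2 + \tau \|x\|_1$, which is closely related to (6) but is not the constrained formulation itself. The weight `tau`, the iteration count, the function names and the synthetic data are assumptions of this example.

```python
import numpy as np

def soft_threshold(z, t):
    """Element-wise soft-thresholding, the proximal operator of t*||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_l1(Phi, y, tau=0.01, n_iter=500):
    """Approximately minimise 0.5*||y - Phi x||_2^2 + tau*||x||_1 with ISTA."""
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)           # gradient of the quadratic term
        x = soft_threshold(x - step * grad, step * tau)
    return x

# Synthetic example: a 3-sparse vector observed through a random dictionary.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((30, 60))
Phi /= np.linalg.norm(Phi, axis=0)             # unit-norm columns
x_true = np.zeros(60)
x_true[[3, 17, 42]] = [1.0, -0.8, 0.6]
y = Phi @ x_true
x_hat = ista_l1(Phi, y)
```

ISTA is chosen here only because it fits in a few lines; it converges slowly compared with the solvers cited above.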
In a practical face recognition problem, the coefficient vector $x_1$ (or
$x_0$) is not only sparse but also block sparse. To see this, we can
rewrite the sparse representation problem (2) as follows

$$y = \Phi x = [\Phi_1, \Phi_2, \ldots, \Phi_K]\, x = \sum_{j=1}^{K} \Phi_j x_j \qquad (7)$$

where $x_j \in \mathbb{R}^{n_j \times 1}$ is the coefficient vector
associated with the j-th class, and $x = [x_1^T, \ldots, x_K^T]^T$. When a
test face y belongs to the j-th class, ideally only elements in $x_j$ are
significantly nonzero. In other words, only the block $x_j$ has
significantly nonzero norm. Clearly, this is a canonical block sparse
model [18, 26]. Many algorithms for the block sparse model can be used
here. For example, in [10] it is suggested to use the following algorithm:

$$x_{2,1} = \arg\min_x \sum_{j=1}^{K} \|x_j\|_2 \quad \text{s.t.} \quad \|y - \Phi x\|_2 \le \epsilon \qquad (8)$$

This is a natural extension of basic $\ell_1$-minimization algorithms,
which imposes the $\ell_2$ norm on block elements and then the $\ell_1$
norm over blocks. It has been shown that exploiting the block structure
can largely improve the estimation quality of $x_0$.
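
To make the mixed $\ell_2/\ell_1$ penalty in (8) concrete, the sketch below runs a proximal-gradient iteration on the Lagrangian form $\frac{1}{2}\|y - \Phi x\|_2^2 + \tau \sum_j \|x_j\|_2$. This is a simplified stand-in chosen for brevity, not the constrained problem (8) nor the solver used in the cited work; `tau`, `blocks` (a list of index lists that partition the coefficients) and the function name are assumptions of this example.

```python
import numpy as np

def block_ista(Phi, y, blocks, tau=0.01, n_iter=500):
    """Approximately minimise 0.5*||y - Phi x||_2^2 + tau * sum_j ||x_j||_2.

    blocks : list of index lists, one per block; together they must cover
             every coefficient exactly once (illustrative input format).
    """
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = x - step * (Phi.T @ (Phi @ x - y))   # gradient step
        for idx in blocks:
            norm = np.linalg.norm(z[idx])
            # Block soft-thresholding: shrink the whole block towards zero,
            # or set it exactly to zero when its l2 norm is small.
            scale = max(1.0 - step * tau / norm, 0.0) if norm > 0 else 0.0
            x[idx] = scale * z[idx]
    return x

# Tiny check with an identity dictionary and two blocks of size 2.
y_demo = np.array([3.0, 4.0, 0.3, 0.4])
x_demo = block_ista(np.eye(4), y_demo, blocks=[[0, 1], [2, 3]], tau=1.0, n_iter=5)
```

With the identity dictionary the first block (norm 5) is only shrunk, while the second block (norm 0.5) is removed as a whole, which is the block-level sparsity that (8) encourages.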

However, one should note that when the test face belongs to the j-th
class, not only is the representation coefficient block $x_j$ a nonzero
block, but its elements are also correlated in amplitude. The correlation
arises because the faces of the j-th class in the training set are all
correlated with the test face, and thus the elements in $x_j$ are mutually
dependent. It has been shown that exploiting the correlation within
blocks can further improve the estimation quality of $x_0$ [20, 29]
compared with exploiting the block structure alone.

Therefore, in this study we propose to use block sparse Bayesian learning
(BSBL) [20] to estimate $x_0$ by exploiting both the block structure and
the correlation within blocks. In the next section we first briefly
introduce sparse Bayesian learning (SBL), and then introduce BSBL.



3. SBL and BSBL

SBL [30] was initially proposed as a machine learning method. It was
later shown to be a powerful method for sparse representation, sparse
signal recovery and compressed sensing.

3.1. Advantages of SBL

Compared to LASSO-type algorithms (such as the original LASSO algorithm,
Basis Pursuit Denoising, Group Lasso, Group Basis Pursuit, and other
algorithms based on $\ell_1$-minimization), SBL has the following
advantages [31, 32].

1. Its recovery performance is robust to the characteristics of the matrix
   $\Phi$, while that of other algorithms is not. For example, it has been
   shown that when columns of $\Phi$ are highly coherent, SBL still
   maintains good performance, while LASSO and other algorithms based on
   convex relaxation have seriously degraded performance [33]. This
   advantage is very attractive for sparse representation and other
   applications, since in these applications the matrix $\Phi$ is not a
   random matrix and its columns are highly coherent.

2. SBL has a number of desirable advantages over many popular algorithms
   in terms of local and global convergence. It can be shown that SBL
   provides a sparser solution than LASSO-type algorithms. In particular,
   in noiseless situations and under certain conditions, the global
   minimum of the SBL cost function is unique and corresponds to the true
   sparsest solution, while the global minimum of the cost function of
   LASSO-type algorithms is not necessarily the true sparsest solution
   [34]. These advantages imply that SBL is a better choice in feature
   selection via sparse representation [35].

3. Recent works in SBL [36, 20] provide robust learning rules for
   automatically estimating the value of its regularizer (related to the
   noise variance) such that SBL algorithms can achieve good performance.
   In contrast, LASSO-type algorithms need users to choose the value of
   such a regularizer, which is generally done by cross-validation.
   However, cross-validation is time-consuming on large-scale datasets,
   which is inconvenient and even infeasible in some scenarios.



3.2. Introduction to BSBL 

BSBL [20] is an extension of the basic SBL framework, which exploits a
block structure and intra-block correlation in the coefficient vector x.
It is based on the assumption that x can be partitioned into K
non-overlapping blocks:

$$x = [\underbrace{x_{1,1}, \ldots, x_{1,d_1}}_{x_1^T}, \ldots, \underbrace{x_{K,1}, \ldots, x_{K,d_K}}_{x_K^T}]^T \qquad (9)$$

Among these blocks, only a few are nonzero. Each block
$x_i \in \mathbb{R}^{d_i \times 1}$ (in the face recognition model (7),
$d_i = n_i$) is assumed to follow a parameterized multivariate Gaussian
distribution:

$$p(x_i; \gamma_i, B_i) \sim \mathcal{N}(0, \gamma_i B_i), \quad i = 1, \ldots, K \qquad (10)$$

with the unknown parameters $\gamma_i$ and $B_i$. Here $\gamma_i$ is a
nonnegative parameter controlling the block-sparsity of x. When
$\gamma_i = 0$, the i-th block becomes zero. During the learning procedure
most $\gamma_i$ tend to be zero, due to the mechanism of automatic
relevance determination [30]. Thus sparsity at the block level is
encouraged. $B_i \in \mathbb{R}^{d_i \times d_i}$ is a symmetric positive
definite matrix capturing the intra-block correlation of the i-th block.
Under the assumption that blocks are mutually uncorrelated, the prior of x
is $p(x; \{\gamma_i, B_i\}_i) \sim \mathcal{N}(0, \Sigma_0)$, where
$\Sigma_0 = \mathrm{diag}\{\gamma_1 B_1, \ldots, \gamma_K B_K\}$. To avoid
overfitting, constraints are imposed on all the $B_i$ and their estimates
are further regularized. The model noise $n = y - \Phi x$ is assumed to
satisfy $p(n; \lambda) \sim \mathcal{N}(0, \lambda I)$, where $\lambda$ is
a positive scalar to be estimated. Based on the above probability models,
one can obtain a closed-form posterior. Therefore, the estimate of x can
be obtained by Maximum-A-Posteriori (MAP) estimation, provided that all
the parameters $\lambda, \{\gamma_i, B_i\}_{i=1}^{K}$ have been estimated.

To estimate the parameters $\lambda, \{\gamma_i, B_i\}_{i=1}^{K}$, one can
use the Type II maximum likelihood method [37, 30]. This is equivalent to
minimizing the following cost function

$$
\begin{aligned}
\mathcal{L}(\Theta) &\triangleq -2 \log \int p(y \mid x; \lambda)\, p(x; \{\gamma_i, B_i\}_i)\, dx \\
&= \log |\lambda I + \Phi \Sigma_0 \Phi^T| + y^T (\lambda I + \Phi \Sigma_0 \Phi^T)^{-1} y,
\end{aligned}
\qquad (11)
$$

where $\Theta$ denotes all the parameters, i.e.,
$\Theta = \{\lambda, \{\gamma_i, B_i\}_{i=1}^{K}\}$. There are several
optimization methods to minimize the cost function, such as the
expectation-maximization method, the bound-optimization method, the
duality method and so on. This framework is called the BSBL framework.
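
For readers who want to see the quantities in (10) and (11) numerically, the sketch below evaluates the cost function for fixed parameters and computes the corresponding posterior mean, using the standard linear-Gaussian identity $\mu_x = \Sigma_0 \Phi^T (\lambda I + \Phi \Sigma_0 \Phi^T)^{-1} y$. It does not learn the parameters and is not the BSBL-BO algorithm. The function names are hypothetical, and the AR(1) Toeplitz form used to generate an example $B_i$ is an assumption of this illustration only.

```python
import numpy as np

def ar1_toeplitz(d, r):
    """Illustrative correlation matrix with entries r**|i-j| (an assumption)."""
    idx = np.arange(d)
    return r ** np.abs(idx[:, None] - idx[None, :])

def build_sigma0(gammas, Bs):
    """Block-diagonal prior covariance diag{gamma_1 B_1, ..., gamma_K B_K}."""
    n = sum(B.shape[0] for B in Bs)
    Sigma0 = np.zeros((n, n))
    start = 0
    for g, B in zip(gammas, Bs):
        d = B.shape[0]
        Sigma0[start:start + d, start:start + d] = g * B
        start += d
    return Sigma0

def bsbl_cost_and_mean(Phi, y, lam, gammas, Bs):
    """Evaluate the cost (11) and the posterior mean of x for fixed parameters."""
    Sigma0 = build_sigma0(gammas, Bs)
    Sigma_y = lam * np.eye(Phi.shape[0]) + Phi @ Sigma0 @ Phi.T
    _, logdet = np.linalg.slogdet(Sigma_y)
    alpha = np.linalg.solve(Sigma_y, y)      # (lam*I + Phi Sigma0 Phi^T)^{-1} y
    cost = logdet + y @ alpha
    mu_x = Sigma0 @ Phi.T @ alpha            # posterior mean (MAP estimate)
    return cost, mu_x

# Example: two blocks of size 2, the second one switched off (gamma = 0).
rng = np.random.default_rng(1)
Phi = rng.standard_normal((3, 4))
y = rng.standard_normal(3)
Bs = [ar1_toeplitz(2, 0.9), ar1_toeplitz(2, 0.9)]
cost, mu_x = bsbl_cost_and_mean(Phi, y, 0.1, [1.0, 0.0], Bs)
```

Setting a $\gamma_i$ to zero forces the posterior mean of that block to zero, which is how block sparsity shows up in the estimate.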

BSBL not only has the advantages of the basic SBL listed in Section 3.1,
but also has two further advantages:

1. BSBL provides large flexibility to model and exploit correlation
   structure in signals, such as intra-block correlation [20, 29]. By
   exploiting the correlation structures, recovery performance is
   significantly improved.

2. BSBL has the unique ability to find less-sparse [38] and non-sparse [29]
   true solutions with very small errors.^2 This is attractive for
   practical use, since in practice the true solutions may not be very
   sparse, and existing sparse signal recovery algorithms generally fail in
   this case.

Therefore, BSBL is promising for pattern recognition. In the following
we use BSBL for face recognition. Among a number of BSBL algorithms, we
choose the bound-optimization based BSBL algorithm [20], denoted by
BSBL-BO.^3

4. Face recognition via BSBL 

As stated in Section 2, we use BSBL-BO to estimate $x_0$, denoted by
$x_{\mathrm{BSBL}}$, and then use the rule (4) to assign a test face y to a
class.

In practice, a test face y may contain some outliers, i.e., $y = y_0 + e$,
where $y_0$ is the outlier-free face image and e is a vector whose nonzero
entries are the outliers. Generally, the number of outliers is small, and
thus e is sparse. Addressing the outlier issue is important to a practical
face recognition system. In [10], an augmented sparse model was used to
deal with this issue. We now extend this method to our block sparse model,
and use BSBL-BO to estimate the solution. In particular, we adopt the
following augmented block sparse model:

$$
\begin{aligned}
y &= y_0 + e = \Phi x + n + e \\
  &= [\Phi, I]\,[x^T, e^T]^T + n \\
  &= \bar{\Phi} \bar{x} + n
\end{aligned}
\qquad (12)
$$

where n is a vector modeling dense Gaussian noise, $\bar{\Phi} = [\Phi, I]$
and $\bar{x} = [x^T, e^T]^T$. Here I is the identity matrix of dimension
m x m. Clearly, $\bar{x}$ is also a block sparse vector, whose first K
blocks are the blocks of x and whose last m elements form m blocks of
size 1.^4 Thus, (12) is still a block sparse model, and can be solved by
BSBL-BO. Once BSBL-BO obtains the solution, denoted by
$\bar{x}_{\mathrm{BSBL}}$, its first K blocks (denoted by
$x_{\mathrm{BSBL}}$) and its last m elements (denoted by $\hat{e}$) are
used to assign y to a class according to

$$i = \arg\min_j \|y - \hat{e} - \Phi \delta_j(x_{\mathrm{BSBL}})\|_2 \qquad (13)$$

^2 Note that for an underdetermined inverse problem, i.e., $y = \Phi x$
where $\Phi \in \mathbb{R}^{m \times n}$ is one matrix or a product of a
sensing matrix and a dictionary matrix as used in compressed sensing, one
cannot find the true solution without any error if the true solution x is
non-sparse (i.e., $\|x\|_0 > m$).

^3 The BSBL-BO code can be downloaded at
http://dsp.ucsd.edu/~zhilin/BSBL.html.

^4 In experiments we found that treating the m elements as one big block
resulted in similar performance, while significantly speeding up the
algorithm.



We now take the Extended Yale B database [39] as an example to show how
our method works. As in SRC [10], we randomly select half of the total
2414 faces (i.e., 1207 faces) as the training set and the rest as the
testing set. Each face is downsampled from 192 x 168 to 24 x 21 = 504
pixels. The training set contains 38 subjects, and each subject has about
32 faces. Therefore, in our model K = 38 and
$n_1 \approx \cdots \approx n_K \approx 32$. The matrix $\Phi$ has size
504 x 1207, and thus the augmented matrix $\bar{\Phi}$ has size
504 x 1711.
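
The dimensions quoted above can be checked with a short sketch. The text does not state how the downsampling was performed, so block averaging by a factor of 8 is an assumption made here, as are the function name and the placeholder arrays (no real face data is used).

```python
import numpy as np

def downsample_block_mean(img, factor=8):
    """Downsample a 2-D image by averaging non-overlapping factor x factor blocks."""
    h, w = img.shape
    if h % factor or w % factor:
        raise ValueError("image size must be divisible by the factor")
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# Placeholder 192 x 168 "face" (synthetic values, not from the database).
face = np.arange(192 * 168, dtype=float).reshape(192, 168)
feature = downsample_block_mean(face).ravel()        # 24 x 21 = 504 features

# Dictionary of 1207 training features and its augmented version [Phi, I].
Phi = np.zeros((feature.size, 1207))
Phi_bar = np.hstack([Phi, np.eye(feature.size)])
```

The shapes of `feature`, `Phi` and `Phi_bar` reproduce the sizes 504, 504 x 1207 and 504 x 1711 given in the text.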

The procedure is illustrated in Fig. 1. Fig. 1(a) shows that a test face
(belonging to Subject 4) can be represented as a linear combination of a
few training faces. Most of the coefficients estimated by BSBL-BO (i.e.,
$x_{\mathrm{BSBL}}$) are zero or near zero, and only those associated with
the subject of the test face are significantly nonzero. Fig. 1(b) shows
the residuals $\|y - \Phi \delta_j(x_{\mathrm{BSBL}})\|_2$ for
j = 1, ..., 38. The residual at j = 4 is 0.0008, while the residuals at
$j \neq 4$ are all close to 1, which makes it easy to assign the test face
to Subject 4. See Section 5.1.1 for more details.
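
The augmented model (12) and the decision rule (13) of this section can be sketched as follows. The sketch only builds the augmented dictionary and applies the rule to a given solution; the solver itself (BSBL-BO in this paper) is not reproduced, and the toy example simply plugs in the true coefficients. All function and variable names are assumptions of this illustration.

```python
import numpy as np

def augment_dictionary(Phi, block_sizes):
    """Return [Phi, I] and the block sizes of the augmented vector in (12)."""
    m = Phi.shape[0]
    Phi_bar = np.hstack([Phi, np.eye(m)])
    # K class blocks followed by m outlier blocks of size 1
    sizes = list(block_sizes) + [1] * m
    return Phi_bar, sizes

def classify_with_outliers(Phi, y, x_bar, labels):
    """Apply rule (13): subtract the estimated outliers, then compare residuals."""
    n = Phi.shape[1]
    x_hat, e_hat = x_bar[:n], x_bar[n:]
    labels = np.asarray(labels)
    classes = np.unique(labels)
    residuals = np.array([
        np.linalg.norm(y - e_hat - Phi @ np.where(labels == c, x_hat, 0.0))
        for c in classes
    ])
    return classes[np.argmin(residuals)], residuals

# Toy example: 3 classes, 2 columns each, one outlier entry in y.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((10, 6))
labels = [0, 0, 1, 1, 2, 2]
x = np.array([0.0, 0.0, 0.0, 0.0, 0.9, 0.5])   # class 2 is active
e = np.zeros(10)
e[3] = 5.0                                      # a single large outlier
y = Phi @ x + e
Phi_bar, sizes = augment_dictionary(Phi, [2, 2, 2])
x_bar = np.concatenate([x, e])                  # stands in for a solver output
label, residuals = classify_with_outliers(Phi, y, x_bar, labels)
```

Subtracting the estimated outlier vector before computing the residuals keeps a few corrupted entries from dominating the class decision.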



5. Experimental results 

To evaluate the performance of BSBL, we performed experiments on two
widely used face databases: Extended Yale B [39] and the AR face database
[40]. The face images of these two databases were captured under varying
lighting, pose or facial expression. The AR database also has occluded
face images for testing the robustness of face recognition algorithms.
Section 5.1 shows experimental results on face images without occlusion,
and Section 5.2 shows experimental results on face images with three
kinds of occlusion.

5.1. Face recognition without occlusion

For the experiments on face images without occlusion, we used
downsampling, Eigenfaces [41], and Laplacianfaces [5] to reduce the
dimensionality of original faces. We compared our method with three
classical methods: Nearest Neighbor (NN) [7], Nearest Subspace (NS) [8],
and Support Vector Machine (SVM) [9]. We also compared our method with
recently proposed sparse-representation based classification methods,
including the basic sparse-representation classifier (SRC) [10] and the
block-sparse recovery algorithm via convex optimization (BSCO) [19]. For
NS, the subspace dimension was fixed to 9. For BSCO, we used the
$P'_{\ell_2/\ell_1}$ algorithm [19], which has been shown to be the best
one among all the structured sparsity-based classifiers proposed in that
work.

Fig. 1. Face recognition via BSBL. (a) Recognition with 24 x 21
downsampled faces as features. The left picture shows a test face image
and the downsampled face. The right picture shows the estimated
coefficients $x_{\mathrm{BSBL}}$. The test face belongs to Subject 4, and
thus the representation coefficients associated with the downsampled
training faces of the subject, i.e., $\Phi_4$, have large values. Two
training faces (and their downsampled faces) associated with the two
largest coefficients are plotted. The bars near the top and the bottom of
the box indicate the blocks in the coefficient vector. (b) The residuals
$\|y - \Phi \delta_j(x_{\mathrm{BSBL}})\|_2$ for j = 1, ..., 38.

5.1.1. Extended Yale B database 

The Extended Yale B database consists of 2414 frontal-face images of 38
subjects (each subject has about 64 images). In the experiment, we used
the cropped 192 x 168 face images, which were captured under various
lighting conditions [42]. Two subjects are shown in Fig. 2 for
illustration (for each subject, only 10 face images are shown). We
randomly selected half of the face images of each subject as the training
set and used the rest as the testing set. We used downsampling,
Eigenfaces, and Laplacianfaces to extract features from face images. The
dimensions of the extracted features were 30, 56, 120 and 504,
respectively.
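
As a rough illustration of one of the feature extractors named above, the sketch below computes Eigenface-style features by projecting mean-centred faces onto the leading principal directions obtained from an SVD. It is a generic PCA sketch on synthetic data, not the exact pipeline used in the experiments; the function names and array sizes are assumptions.

```python
import numpy as np

def fit_eigenfaces(X, d):
    """Fit a d-dimensional PCA basis.

    X : (n_pixels, n_faces) matrix with one vectorized training face per column
    Returns the mean face (n_pixels, 1) and an orthonormal basis (n_pixels, d).
    """
    mean = X.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(X - mean, full_matrices=False)
    return mean, U[:, :d]

def project(faces, mean, W):
    """Map vectorized faces (as columns) to d-dimensional feature vectors."""
    return W.T @ (faces - mean)

# Synthetic stand-in for training data: 20 "faces" with 50 pixels each.
rng = np.random.default_rng(0)
X_train = rng.standard_normal((50, 20))
mean, W = fit_eigenfaces(X_train, d=5)
features = project(X_train, mean, W)    # columns play the role of dictionary columns
```

The columns of `features` would then form the dictionary in (1), and a test face would be projected with the same `mean` and `W` before classification.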

Experimental results are shown in Fig. 3, where we can see that our method
uniformly outperformed the other algorithms regardless of the features
used. In particular, our method had better performance when using
Laplacianfaces, and its superiority was clearest when the feature
dimension was small and Laplacianfaces were used. For example, when the
feature dimension was 56, our method achieved the highest rate of 98.9%,
while NN, NS, SVM, SRC and BSCO achieved rates of 83.5%, 90.4%, 85.0%,
91.7% and 79.4%, respectively. High performance with low dimensional
features is attractive for recognition, since a lower feature dimension
generally implies a lower computational load.

Fig. 2. Sample face images of 2 individuals from the Extended Yale B
database. 1st row: ten sample face images of the first subject. 2nd row:
ten sample face images of the third subject.

5.1.2. AR database 

The AR database consists of more than 4000 frontal-face images of 126
human subjects. Each subject has 26 images taken in two separate sessions,
as shown in Fig. 4. This database includes more variation in facial
expression and facial disguise. We chose 100 subjects (50 male and 50
female) in this experiment. For each subject, seven face images with
different illumination and facial expression (i.e., the first 7 images of
each subject) in Session 1 were selected for training, and the first 7
images of each subject in Session 2 for testing. All the images were
converted to grayscale and resized to 165 x 120. Downsampled faces,
Eigenfaces and Laplacianfaces were applied with dimensions of 30, 54, 130
and 540. Experimental results are shown in Fig. 5.

From Fig. 5(a), we can see that our algorithm significantly outperformed
the other classifiers when using downsampled features. However, our method
did not achieve the highest rate when using Eigenfaces and Laplacianfaces.
This might be due to the small block size in this experiment
($n_1 = n_2 = \cdots = n_{100} = 7$). Although our method did not
uniformly outperform the other algorithms across face features, the
recognition rate achieved by our method using downsampled faces (96.7%)
was not exceeded by any other algorithm using any face feature.

5.2. Face recognition with occlusion 

For the experiments on face images with occlusion, we used downsampling
to reduce the size of face images and compared our method with NN [7] and
SRC [10].

Fig. 3. Comparison of recognition rates on Extended Yale B database when
using different face features. (a) Downsampling faces. (b) Eigenfaces. (c)
Laplacianfaces. Each panel plots recognition rate against feature
dimension for BSBL, NN, NS, SVM, SRC and BSCO.



5.2.1. Face recognition with pixel corruption 

Fig. 4. Face images of two individuals from AR database. The 1st row: face
images of the first male subject in Session 1. The 2nd row: face images of
the first male subject in Session 2. The 3rd row: face images of the first
female subject in Session 1. The 4th row: face images of the first female
subject in Session 2.

We tested face recognition with pixel corruption on 3 subsets of the
Extended Yale B database: 719 face images with normal-to-moderate lighting
conditions from Subsets 1 and 2 were used for training, and 455 face
images with more extreme lighting conditions from Subset 3 were used for
testing. For each test image, we first replaced a certain percentage
(0%-50%) of its original pixels by uniformly distributed gray values in
[0, 255]. Both the gray values and the locations were random and hence
unknown to the algorithms. We then downsampled all the images to the sizes
6 x 5, 8 x 7, 12 x 10 and 24 x 21, respectively.
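
The corruption step described above can be sketched in a few lines. The function name, the use of a NumPy random generator and the placeholder image are assumptions of this example; whether gray values were drawn as integers or reals is not stated in the text, and integers are used here.

```python
import numpy as np

def corrupt_pixels(img, fraction, rng):
    """Replace a given fraction of pixels, at random locations, by random
    gray values drawn uniformly from the integers 0..255."""
    out = img.copy()
    k = int(round(fraction * out.size))                   # number of pixels to corrupt
    idx = rng.choice(out.size, size=k, replace=False)     # distinct random locations
    out.flat[idx] = rng.integers(0, 256, size=k)          # uniform gray values
    return out

rng = np.random.default_rng(0)
clean = np.full((192, 168), 128, dtype=np.uint8)          # placeholder test image
corrupted = corrupt_pixels(clean, 0.30, rng)              # 30% corruption level
```

Because locations are drawn without replacement, exactly the requested share of pixel positions is overwritten, although a drawn value can coincide with the original one.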

Results are shown in Table 1. It can be seen that BSBL achieved the
highest recognition rate at almost all dimensions and corruption levels
(in Table 1 the only exceptions are the 6 x 5 faces at 40% and 50%
corruption, where SRC was about 2 percentage points higher), and the
performance gap between our algorithm and the compared algorithms was
often very large. For example, when the dimension was 24 x 21 and 50% of
the pixels were corrupted, BSBL achieved a recognition rate of 89.01%,
while SRC only had a recognition rate of 73.63%. Fig. 6(a) shows the
recognition rates of the three algorithms at different pixel corruption
levels when the face dimension was 24 x 21.

5.2.2. Face recognition with block occlusion 

In this experiment, we used the same training and testing images as
those in the previous pixel corruption experiment. For each test image,
we replaced a randomly located square block with an unrelated image (the
baboon image in SRC [10]), which occluded 0%-50% of the original testing
image. We then downsampled all the images to the sizes 6 x 5, 8 x 7,
12 x 10 and 24 x 21, respectively.
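
A minimal sketch of the block-occlusion step is given below. The occluding patch here is an arbitrary array rather than the baboon image, the patch is resized by nearest-neighbour sampling, and the side length is rounded to the nearest pixel; these choices and the function name are assumptions of this example.

```python
import numpy as np

def occlude_block(img, patch, fraction, rng):
    """Paste a square patch covering roughly `fraction` of the image area
    at a random location."""
    h, w = img.shape
    side = min(int(round(np.sqrt(fraction * h * w))), h, w)
    out = img.copy()
    if side == 0:
        return out
    top = rng.integers(0, h - side + 1)        # random top-left corner
    left = rng.integers(0, w - side + 1)
    # Nearest-neighbour resize of the patch to side x side
    rows = np.arange(side) * patch.shape[0] // side
    cols = np.arange(side) * patch.shape[1] // side
    out[top:top + side, left:left + side] = patch[np.ix_(rows, cols)]
    return out

rng = np.random.default_rng(0)
face = np.zeros((192, 168))                    # placeholder test image
patch = np.ones((64, 64))                      # placeholder unrelated image
occluded = occlude_block(face, patch, 0.25, rng)
```

For a 192 x 168 image and 25% occlusion the square has side 90, so 8100 of the 32256 pixels are replaced.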

Table 2 shows the recognition rates of NN, SRC and BSBL at different
dimensions and percentages of occlusion. BSBL again outperformed the
compared algorithms in most cases, although at the smaller dimensions with
30% or more occlusion NN or SRC was sometimes higher. For example, when
the occlusion percentage ranged from 10% to 50% and the face dimension was
24 x 21, BSBL achieved about 0.5%-7.5% higher recognition rate than SRC,
as shown in Fig. 6(b).

Fig. 5. Comparison of recognition rates on AR database when using
different face features. (a) Downsampling faces. (b) Eigenfaces. (c)
Laplacianfaces.

5.2.3. Face recognition with real face disguise 

We used a subset of the AR database to test the performance of our method
on face recognition with disguise. We chose 799 images of various facial
expressions without occlusion (i.e., the first 4 face images in each
session, except a corrupted image named 'W-027-14.bmp') for training. We
formed two separate testing sets of 200 images each. The images in the
first set were from the neutral expression with sunglasses (the 8th image
in each session), which cover roughly 20% of the face, while the ones in
the second set were from the neutral expression with scarves (the 11th
image in each session), which cover roughly 40% of the face. All the
images were resized to 9 x 6, 13 x 10, 27 x 20 and 42 x 30, respectively.

Results are shown in Table 3. In the case of neutral expression with
sunglasses, both SRC and NN achieved higher recognition rates than BSBL.
However, in the case of neutral expression with scarves, BSBL
outperformed SRC and NN significantly. Over the two testing sets combined,
BSBL achieved the highest recognition rates at the dimensions 27 x 20 and
42 x 30.

Tab. 1. Recognition rate on faces with pixel corruption (%)

                             Percent corrupted (%)
Method  Dimension        0      10      20      30      40      50
NN      6 x 5        36.92   42.42   49.67   46.15   28.35   14.95
NN      8 x 7        48.79   54.95   60.00   59.34   40.88   20.22
NN      12 x 10      67.25   75.17   79.56   74.73   58.02   35.39
NN      24 x 21      87.25   93.19   94.95   92.53   76.48   56.04
SRC     6 x 5        54.51   44.62   50.55   46.59   32.75   21.76
SRC     8 x 7        82.64   61.32   66.59   63.52   49.23   28.79
SRC     12 x 10      98.02   85.06   85.28   83.96   71.87   46.37
SRC     24 x 21     100.00   98.24   98.24   97.14   92.09   73.63
BSBL    6 x 5        87.25   85.71   68.79   51.43   30.99   19.56
BSBL    8 x 7        94.29   92.97   86.15   72.53   59.12   39.34
BSBL    12 x 10      99.56   99.34   97.80   92.31   84.18   67.25
BSBL    24 x 21     100.00  100.00   99.78   99.12   97.58   89.01

Tab. 2. Recognition rate on faces with block occlusion (%)

                             Percent occluded (%)
Method  Dimension        0      10      20      30      40      50
NN      6 x 5        36.92   34.29   27.69   24.40   20.44   15.17
NN      8 x 7        48.79   44.84   38.68   32.09   21.54   18.46
NN      12 x 10      67.25   64.18   52.09   45.71   30.33   22.64
NN      24 x 21      87.25   85.50   76.92   67.25   52.31   37.14
SRC     6 x 5        54.51   36.26   28.13   22.64   17.36   14.29
SRC     8 x 7        82.64   50.99   39.56   31.65   20.66   17.36
SRC     12 x 10      98.02   75.39   59.78   48.57   30.33   20.88
SRC     24 x 21     100.00   96.48   89.23   72.31   54.29   35.17
BSBL    6 x 5        87.25   46.59   28.35   18.68   11.65   10.55
BSBL    8 x 7        94.29   66.59   40.88   26.59   20.22   12.53
BSBL    12 x 10      99.56   83.30   60.00   45.28   33.19   22.20
BSBL    24 x 21     100.00   96.92   92.31   75.60   56.48   42.64

Fig. 6. Comparison of recognition rates on faces with different
percentages of (a) pixel corruption and (b) block occlusion when the face
dimension was 24 x 21. Horizontal axes: percent corrupted (%) in (a) and
percent occluded (%) in (b), from 0 to 50.

Tab. 3. Recognition rate on faces with disguise (%)

                   Sunglasses               Scarves                  Total
Dimension      NN     SRC    BSBL      NN     SRC    BSBL      NN     SRC    BSBL
9 x 6       35.00   46.50   22.00    6.50   10.00   23.00   20.75   28.25   22.50
13 x 10     48.00   72.00   40.50    7.00   16.00   46.00   27.50   44.00   43.25
27 x 20     65.50   83.00   64.00    9.50   21.50   81.00   37.50   52.25   72.50
42 x 30     68.00   89.00   65.50   11.50   37.00   83.50   39.75   63.00   74.50

6. Conclusions 

Classification via sparse representation is a popular methodology in face
recognition and other classification tasks. Previous works generally
focused on the use of convex algorithms such as LASSO and mixed
$\ell_2/\ell_1$-minimization algorithms. In this paper we introduced a
recently proposed block sparse Bayesian learning algorithm for face
recognition. The algorithm has been shown to have a number of advantages
over popular convex algorithms. Experiments on common face databases
showed that the algorithm is a promising sparse-representation-based
classifier.